A defining feature of non-stationary systems is the time dependence of their statistical parameters. Measured time series may exhibit Gaussian statistics on short time horizons, due to the central limit theorem. The sample statistics for long time horizons, however, average over the time-dependent parameters. To model the long-term statistical behavior, we compound the local distribution with the distribution of its parameters. Here we consider two concrete but diverse examples of such non-stationary systems: the turbulent air flow of a fan and a time series of foreign exchange rates. Our main focus is to empirically determine the appropriate parameter distribution for the compounding approach. To this end we have to estimate the parameter distribution for univariate time series in a highly non-stationary situation.

I. INTRODUCTION

Due to the central limit theorem, a great many phenomena can be described by Gaussian statistics. This also guides our perception of the risks of large deviations from an expectation value. Consequently, any increased probability of extreme events is always cause for concern and a subject of intense research interest. In a large variety of systems where heavy-tailed distributions are observed, Gaussian statistics holds only locally: the parameters of the distribution change, either in time or in space. Thus, to describe the sample statistics for the whole system, one has to average the parametric distribution over the distribution of the (shape) parameter. This construction is known as compounding or mixture in the mathematics literature and as superstatistics [4] in the physics literature.

An important example in this context is the K-distribution, first mentioned in 1978 by Jakeman and Pusey [5]. It was introduced in the context of intensity distributions, and its significance for scattering processes over a wide range of length scales was stressed. Moreover, the distribution is known to be an equilibrium solution for the population in a simple birth-death-immigration process, which had already been applied to the description of eddy evolution in a turbulent medium. The underlying picture of turbulence assumes that large eddies are spontaneously created and then "give birth" to generations of child eddies; this cascade terminates when the smallest eddies die out due to viscous dissipation. In Ref. [5], Jakeman and Pusey use the K-distribution to fit data of microwave sea echo, which turned out to be highly non-Rayleigh. The K-distribution is also found as a special case of a full statistical-mechanical formulation for non-Gaussian compound Markov processes. Field and Tough find K-distributed noise for the diffusion process in electromagnetic scattering [9]. Experimentally, the K-distribution has appeared in the contexts of irradiance fluctuations of a multipass laser beam propagating through atmospheric turbulence, synthetic aperture radar data, ultrasonic scattering from tissues, and mesoscopic systems. In our study, too, we will encounter the K-distribution for one of the systems under consideration.

Compounded distributions can be applied to very different empirical situations. They can describe aggregated statistics for many time series, where each time series obeys stationary Gaussian statistics, the parameters of which vary only between time series. In this case it is straightforward to estimate the parameter distribution. The situation is more difficult when we consider the statistics of a single long time series with time-varying parameters. In this non-stationary case, one often makes an ad hoc assumption about the analytical form of the parameter distribution, and only the compounded distribution is compared to empirical findings.

In this paper we address the problem of determining the parameter distribution empirically for univariate non-stationary time series. Specifically, we consider the case of Gaussian statistics with time-varying variance. In this endeavor we encounter several problems. If the variance for each time point is purely random, as the compounding ansatz would suggest, we have no way of determining the variance distribution from empirical data. A prerequisite for an empirical approach to the parameter distribution is a time series which is quasi-stationary on short time intervals. In other words, the variance should vary only slowly compared to the time scale of fluctuations in the signal. The estimation noise for the local variances competes with the variance distribution itself. Therefore the time interval on which quasi-stationarity holds should not be too short. Furthermore, we have to heed possible autocorrelations in the time series themselves, since they might lead to an estimation bias for the local variances. Our aim is to test the validity of the compounding approach on two different data sets.

The paper is organized as follows. In section II we give a short summary of the compounding approach and present two recent applications where the K-distribution comes into play. In section III we introduce the two systems we are going to analyse: a table-top experiment on air turbulence and the empirical time series of exchange rates between US dollar and Euro. In section IV we address the problem of estimating non-stationary variances in univariate time series. Our empirical results are presented in section V.

II. THEORETICAL CONSIDERATIONS 

We consider a distribution $p(x|a)$ of $d$ random variables, ordered in the vector $x$. It is also a function of a parameter $a$ that determines the shape or other features of the distribution, e.g. the variance of a Gaussian. If, in a given data set, the parameter $a$ varies in an interval $A$, one can try to construct the distribution of $x$ as the linear superposition

$$\langle p \rangle(x) = \int_A f(a)\, p(x|a)\, da \tag{1}$$

of all distributions $p(x|a)$ with $a \in A$. Here, $f(a)$ is the weight function determining the contribution of each value of $a$ in the superposition. Since $a$ itself typically is a random variable, we assume that the function $f(a)$ is a proper distribution. In particular, it is positive semidefinite. As each $p(x|a)$ and the resulting $\langle p \rangle(x)$ have to be normalized with respect to the random vector $x$, Eq. (1) implies the normalization

$$\int_A f(a)\, da = 1. \tag{2}$$
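To make the construction in Eq. (1) concrete, the following minimal sketch draws samples from a compounded distribution by first drawing the parameter and then the local Gaussian. The choice of a $\chi^2$ variance distribution with 10 degrees of freedom, scaled to mean one, as well as the function names, are assumptions made purely for illustration. The excess kurtosis of the compounded samples should then be close to $6/10$, whereas a pure Gaussian has excess kurtosis zero.

```python
import math
import random
import statistics


def sample_compound_gaussian(n, dof, rng):
    """Draw n values x ~ N(0, a) with a ~ chi^2_dof / dof (mean one)."""
    samples = []
    for _ in range(n):
        # chi^2 with `dof` degrees of freedom is Gamma(shape=dof/2, scale=2).
        a = rng.gammavariate(dof / 2.0, 2.0) / dof
        samples.append(rng.gauss(0.0, math.sqrt(a)))
    return samples


def excess_kurtosis(xs):
    """Fourth standardized moment minus 3 (zero for a Gaussian)."""
    m = statistics.fmean(xs)
    var = statistics.fmean([(x - m) ** 2 for x in xs])
    fourth = statistics.fmean([(x - m) ** 4 for x in xs])
    return fourth / var ** 2 - 3.0


rng = random.Random(0)
xs = sample_compound_gaussian(200_000, dof=10, rng=rng)
print("sample variance :", statistics.fmean([x * x for x in xs]))
print("excess kurtosis :", excess_kurtosis(xs))
```

The positive excess kurtosis is the signature of the heavy tails that the averaging over the parameter produces.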

The physical reasons for the variation of the parameter $a$ can be very different. In non-equilibrium thermodynamics, $a$ might be the locally fluctuating temperature. Although our systems are not of a thermodynamic kind, we also have non-stationarities in mind. In recent experiments we studied the propagation of microwaves through an arrangement of disordered scatterers in a cavity. The distribution of the electric fields was measured at fixed frequencies as a function of position. Then time-dependent wave fields were generated by superposition of $N = 150$ patterns,

$$\psi(x,y,t) = \sum_{i=1}^{N} \psi_i(x,y)\, e^{\mathrm{i}(2\pi f_i t - \varphi_i)}. \tag{3}$$

Here $\psi_i(x,y)$ is the wave pattern at frequency $f_i$, and $\varphi_i$ is a random phase. For fixed positions, a Rayleigh distribution was always found in the time sequence $\psi(x,y,t)$ for the distribution of intensities $I$,

$$p(I|I_{\mathrm{loc}}) = \frac{1}{I_{\mathrm{loc}}} \exp\left(-I/I_{\mathrm{loc}}\right). \tag{4}$$

This is nothing but a manifestation of the central limit theorem. The variance, i.e., the averaged intensity $I_{\mathrm{loc}}$, depends on the position. The large amount of data made it possible to extract the distribution of the parameter $I_{\mathrm{loc}}$. To a good approximation, it turned out to be a $\chi^2_\nu$ distribution, see Fig. 1. The number $\nu$ of degrees of freedom was related to the number of independent field components and took a value of $\nu = 30$. The authors of those experiments then used the compounding ansatz Eq. (1) in the form

$$\langle p \rangle(I) = \int_0^\infty \chi^2_\nu(I_{\mathrm{loc}})\, p(I|I_{\mathrm{loc}})\, dI_{\mathrm{loc}}. \tag{5}$$

The integral can be done and yields Eq. (6), an expression in terms of $\mathcal{K}_\mu$, the modified Bessel function of the second kind of degree $\mu$. This is the K-distribution mentioned in the introduction. Omitting local regions with extremely high amplitudes, so-called "hot spots", the intensity distributions could be interpreted very well in terms of K-distributions, see Fig. 1.
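For orientation, the integral in Eq. (5) can be worked out once a normalization is fixed. The sketch below assumes, which the text does not state, that the $\chi^2_\nu$ distribution is rescaled to a mean local intensity $\bar{I}$ (a symbol introduced here only for illustration). It uses the standard integral $\int_0^\infty s^{\mu-1} e^{-\beta/s - \gamma s}\, ds = 2 (\beta/\gamma)^{\mu/2} \mathcal{K}_\mu(2\sqrt{\beta\gamma})$ with $\mu = \nu/2 - 1$, $\beta = I$ and $\gamma = \nu/(2\bar{I})$. Prefactors may differ from other conventions by the choice of scaling.

```latex
\begin{align*}
f(I_{\mathrm{loc}}) &= \frac{1}{\Gamma(\nu/2)}
  \left(\frac{\nu}{2\bar{I}}\right)^{\nu/2}
  I_{\mathrm{loc}}^{\nu/2-1}
  \exp\left(-\frac{\nu I_{\mathrm{loc}}}{2\bar{I}}\right), \\
\langle p \rangle(I) &= \int_0^\infty f(I_{\mathrm{loc}})\,
  \frac{1}{I_{\mathrm{loc}}} \exp\left(-\frac{I}{I_{\mathrm{loc}}}\right)
  dI_{\mathrm{loc}} \\
 &= \frac{2}{\Gamma(\nu/2)}
  \left(\frac{\nu}{2\bar{I}}\right)^{(\nu+2)/4}
  I^{(\nu-2)/4}\,
  \mathcal{K}_{(\nu-2)/2}\left(\sqrt{\frac{2\nu I}{\bar{I}}}\right).
\end{align*}
```

As a consistency check, for $\nu = 2$ the weight $f$ is a simple exponential and the result reduces to $(2/\bar{I})\, \mathcal{K}_0\big(2\sqrt{I/\bar{I}}\big)$, which integrates to one over $I \in [0, \infty)$.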

FIG. 1: (Top) Distribution of the time-averaged intensities $I_{\mathrm{loc}}$ found for the 780 pixels of our measurement. The inset shows the same data using a semi-logarithmic scale. The solid curve is a $\chi^2_\nu$ distribution with $\nu = 32$ degrees of freedom. (Bottom) Intensity distribution for the time-dependent wave patterns generated by Eq. (3). The solid (red) line is given by Eq. (6).

We briefly sketch another example, which stems from finance. Details can be found in the original reference. The conceptual difference to the previous example is that the compounding formula (1) appears as an intermediate result, not as an ansatz. We consider a selection of $K$ stocks with prices $S_k(t)$, $k = 1, \ldots, K$, belonging to the same market. One is interested in the distribution of the relative price changes over a fixed time interval $\Delta t$, also referred to as returns,

$$r_k(t) = \frac{S_k(t + \Delta t) - S_k(t)}{S_k(t)}. \tag{7}$$

As the companies operate on the same market, there are correlations between the stocks which have to be measured over a time window $T$ much larger than $\Delta t$. The corresponding covariances form the $K \times K$ matrix $\Sigma_t$. The business relations between the companies as well as the market expectations of the stock market traders change in time. Thus, the market is non-stationary and the correlation coefficients fluctuate in time. Only for time windows $T$ of a month or less is $\Sigma_t$ approximately constant. The multivariate distribution of the returns $r_k = r_k(t)$, ordered in the $K$-component vector $r$ at a given time $t$, is well described by the Gaussian

$$p(r|\Sigma_t) = \frac{1}{\sqrt{\det(2\pi\Sigma_t)}} \exp\left(-\frac{1}{2}\, r^\dagger \Sigma_t^{-1} r\right). \tag{8}$$

The non-stationarity over longer time windows $T$ can be modeled by observing that the ensemble of the fluctuating covariance matrices $\Sigma_t$ can be approximated by an ensemble of random matrices. In that reference, a Wishart distribution was assumed for this ensemble. The ensemble average of the distribution (8) over the Wishart ensemble yields

$$\langle p \rangle(r|\Sigma, N) = \int_0^\infty \chi^2_N(z)\, p\left(r \,\Big|\, \frac{z}{N}\Sigma\right) dz, \tag{9}$$

where $\Sigma$ is the sample-averaged covariance matrix over the entire time window $T$. As $\Sigma$ is fixed, this result has the form of the compounding ansatz (1). Furthermore, it closely matches the result (5) found in the context of microwave scattering. The number $N$ of degrees of freedom in the $\chi^2_N$ distribution determines the variance in the distribution of the random covariance matrices. The role of the locally averaged intensity $I_{\mathrm{loc}}$ is now played by an effective parameter $z$ which fully accounts for the ensemble average. Again, a K-distribution follows,

$$\langle p \rangle(r|\Sigma, N) = \frac{1}{2^{N/2-1}\, \Gamma(N/2) \sqrt{\det(2\pi\Sigma/N)}}\; \frac{\mathcal{K}_{(K-N)/2}\left(\sqrt{N r^\dagger \Sigma^{-1} r}\right)}{\sqrt{N r^\dagger \Sigma^{-1} r}^{\,(K-N)/2}}, \tag{10}$$

in which the bilinear form $r^\dagger \Sigma^{-1} r$ takes the place of the intensity $I$ in Eq. (6). In both examples, averages over fluctuating quantities produce heavy-tailed distributions which describe the vast majority of the large events.


III. DATA ACQUISITION 
A. Turbulent air flow 

The first data set is obtained by measuring the noise generated by a turbulent air flow. To generate the turbulence we used a standard fan with a rotor frequency of 18.44 Hz. We restricted ourselves to standard audio equipment, which handles frequencies up to 20 kHz at a standard sampling rate of 48 kHz and offers reliable quality at an attractive price. The microphone used for the sound recording is an E 614 by Sennheiser with a frequency response of 40 Hz to 20 kHz, a good directional characteristic and a small diameter of 20 mm. It provides broadband frequency resolution and a point-like measuring position. An external sound card with matching properties was necessary to exploit the full quality of the microphone and to minimize the influence of the intrinsic noise of the PC. Fig. 2 shows a photograph of the setup.

The microphone was placed in front of the continuously running fan, which generates a highly turbulent air flow. The microphone records the time signal of the sound waves excited by the turbulence. The details of the analysed time series are discussed in section IV.

FIG. 2: Setup for the generation and measurement of the turbulent air flow. The turbulent air flow is generated by the fan on the left-hand side and recorded by the microphone in front of it. In the preliminary stage of the experiments the sound card of the laptop was used for data acquisition.


B. Foreign exchange rates 


The foreign exchange markets have the peculiar feature of all-day continuous trading. This is in contrast to stock markets, where the trading hours of different stock exchanges vary due to time zones, with partial overlap of different markets and very peculiar trading behavior at the beginning and the end of each trading day. Therefore foreign exchange rates are particularly suited for the study of long time series.

We consider the time series of hourly exchange rates between Euro and US dollar in the time period from January 2001 to May 2013. The empirical data were obtained from www.fxhistoricaldata.com. We denote the time series of exchange rates by $S(t)$. From these we calculate the time series of returns, i.e., the relative changes in the exchange rates on time intervals $\Delta t$,

$$r(t) = \frac{S(t + \Delta t) - S(t)}{S(t)}. \tag{11}$$

Since we work with hourly data, the smallest possible value for $\Delta t$ is one hour. However, as we will see later on, a return interval of one trading day, $\Delta t = 1\,\mathrm{d}$, is preferable for the variance estimation. Note that foreign exchange rates are typically modeled by a multiplicative random process, such as a geometric Brownian motion. Therefore we consider the relative changes of the exchange rates instead of the exchange rates themselves. While the latter resemble, at least locally, a lognormal distribution, the returns are approximately Gaussian when conditioned on the local variance.
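The return definition above translates directly into code. The following is a minimal sketch; the function name `relative_returns`, the `step` parameter (the return interval in units of the sampling interval) and the sample quotes are hypothetical and serve only as illustration.

```python
def relative_returns(prices, step=1):
    """Relative changes r(t) = (S(t + step) - S(t)) / S(t) of a price list."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    return [(prices[t + step] - prices[t]) / prices[t]
            for t in range(len(prices) - step)]


# Hypothetical hourly quotes, for illustration only.
hourly = [1.3000, 1.3013, 1.3000, 1.3026]
print(relative_returns(hourly))          # one-hour returns
print(relative_returns(hourly, step=2))  # two-hour returns
```

Because the change is divided by the current rate, the returns are dimensionless and comparable across periods with very different exchange rate levels.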


IV. NON-STATIONARY VARIANCES IN 
UNIVARIATE TIME SERIES 


We consider the problem of univariate time series with time-dependent variance. More specifically, we consider time series where the variance is changing, but exhibits a slowly decaying autocorrelation function. This point is crucial, because otherwise it is not possible to make meaningful estimates of the local variances. Time series with this feature show extended periods of large fluctuations interrupted by periods with moderate or small fluctuations. This is illustrated in Fig. 3 for the two data sets we are studying in this paper. In the top plot of Fig. 3 we show the sound signal for the ventilator measurement. In the bottom plot, the time series of daily returns for the foreign exchange data is plotted. In both cases we observe the same qualitative behavior, which is well known in the finance literature as volatility clustering.

The compounding ansatz for univariate time series assumes a normal distribution on short time horizons, where the local variance is nearly stationary. However, we wish to determine the distribution of the local variances empirically, since it is a critical part of the compounding ansatz. If the variances were fluctuating without a noticeable time-lagged correlation, this would not be feasible. Still, we need to establish the right time horizon on which to estimate the local variances. Ref. [19] introduced a method to locally normalize time series with autocorrelated variances. To this end, a local average was subtracted from the data and the result was divided by a local standard deviation. In this spirit, we determine the time horizon on which this local normalization yields normally distributed values and analyse the corresponding local variances.
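The local normalization and the estimation of local variances can be sketched as follows. This is a simplified illustration, not the exact procedure of Ref. [19], whose details are not given here: it assumes non-overlapping windows, and the synthetic test signal with a piecewise constant standard deviation, as well as all function names, are hypothetical.

```python
import random
import statistics


def kurtosis(xs):
    """Fourth standardized moment (equals 3 for a Gaussian)."""
    m = statistics.fmean(xs)
    var = statistics.fmean([(x - m) ** 2 for x in xs])
    return statistics.fmean([(x - m) ** 4 for x in xs]) / var ** 2


def local_variances(xs, window):
    """Sample variances on consecutive, non-overlapping windows."""
    return [statistics.variance(xs[i:i + window])
            for i in range(0, len(xs) - window + 1, window)]


def locally_normalize(xs, window):
    """Subtract the local mean and divide by the local standard deviation."""
    out = []
    for i in range(0, len(xs) - window + 1, window):
        chunk = xs[i:i + window]
        mu = statistics.fmean(chunk)
        sd = statistics.stdev(chunk)
        out.extend((x - mu) / sd for x in chunk)
    return out


# Synthetic signal (assumption): Gaussian noise whose standard deviation
# switches between 0.5 and 2.0 every 50 points.
rng = random.Random(1)
signal = [rng.gauss(0.0, 0.5 if (t // 50) % 2 == 0 else 2.0)
          for t in range(20_000)]

window = 25
print("kurtosis, raw signal        :", kurtosis(signal))
print("kurtosis, locally normalized:", kurtosis(locally_normalize(signal, window)))
print("first local variances       :", local_variances(signal, window)[:4])
```

In this toy setting the raw signal is clearly heavy-tailed, while the locally normalized values are close to Gaussian. For short windows the kurtosis after normalization falls slightly below 3, a finite-sample effect of dividing by the estimated standard deviation.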

FIG. 3: (Top) Sound signal of the ventilator measurement. (Bottom) Time series of daily returns. Both signals have been normalized to mean zero and standard deviation one. In both cases we observe extended periods of low and high fluctuation strength.

Another aspect we need to take into account is a possible bias in the variance estimation which occurs for correlated events. In Fig. 4 we show the autocorrelation function (ACF) of the measured sound signal, as well as the autocorrelation function of the absolute value of hourly returns. Both plots hint at possible problems for the variance estimation. Due to the high sampling frequency, the sound signal is highly correlated. In other words, the sampling time scale is much shorter than the time scale on which the turbulent air flow changes. After 2500 data points, or about 52 ms, the autocorrelation function has decayed to zero. Consequently, we consider only every 2500th data point for our local variance estimation. To improve statistics, we repeat the variance estimation starting with an offset of 1 to 2499. The results are presented in the following section.
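The thinning step for the sound signal can be sketched as follows. An AR(1) process stands in for the oversampled recording, which is an assumption made only to obtain a correlated test signal; the stride of 50 replaces the 2500 points used for the 48 kHz data, and the function names are hypothetical.

```python
import random


def autocorrelation(xs, lag):
    """Sample autocorrelation of xs at a non-negative lag."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    cov = sum((xs[t] - mean) * (xs[t + lag] - mean)
              for t in range(n - lag)) / n
    return cov / var


def thin_with_offsets(xs, stride):
    """Sub-series xs[k], xs[k + stride], ... for offsets k = 0..stride-1."""
    return [xs[k::stride] for k in range(stride)]


# Synthetic, strongly correlated stand-in signal (assumption): AR(1).
rng = random.Random(2)
phi = 0.9
signal = [0.0]
for _ in range(100_000 - 1):
    signal.append(phi * signal[-1] + rng.gauss(0.0, 1.0))

stride = 50  # the text uses 2500 for the sound recording
subseries = thin_with_offsets(signal, stride)
print("ACF at lag 1, full signal:", autocorrelation(signal, 1))
print("ACF at lag 1, thinned    :", autocorrelation(subseries[0], 1))
print("number of sub-series     :", len(subseries))
```

Using all offsets keeps every data point while ensuring that consecutive values within each sub-series are essentially uncorrelated.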

In the case of the foreign exchange data we are confronted with a different problem. The consecutive hourly returns are not correlated. While local trends may always exist, it is unpredictable when a positive trend switches to a negative one, and vice versa, see Ref. [20]. However, the autocorrelation of the absolute values shows a rich structure which is due to characteristic intraday variability. This would lead to a biased variance estimation and, consequently, to a distortion of the variance distribution. Therefore we consider returns between consecutive trading days at the same hour of the day. Put differently, we consider $\Delta t = 1\,\mathrm{d}$ for the returns and get 24 different time series, one for each hour of the day as starting point.
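The construction of the 24 daily-return series can be sketched as follows. The sketch assumes a gap-free list of hourly rates with exactly 24 quotes per trading day, which real data with weekends and holidays will not satisfy without preprocessing; the function name and the toy prices are hypothetical.

```python
def daily_returns_by_hour(hourly_prices, hours_per_day=24):
    """One series of day-to-day returns for each starting hour of the day."""
    series = []
    for hour in range(hours_per_day):
        # Quotes taken at the same hour on consecutive trading days.
        same_hour = hourly_prices[hour::hours_per_day]
        series.append([(later - earlier) / earlier
                       for earlier, later in zip(same_hour, same_hour[1:])])
    return series


# Toy data: three "days" of linearly increasing hourly quotes.
prices = [100.0 + i for i in range(72)]
by_hour = daily_returns_by_hour(prices)
print(len(by_hour), "series with", len(by_hour[0]), "returns each")
print("first return of the hour-0 series:", by_hour[0][0])
```

Each of the resulting series compares like with like, so the characteristic intraday pattern no longer enters the variance estimate.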


V. EMPIRICAL RESULTS 

FIG. 4: (Top) Autocorrelation function of the measured sound signal. (Bottom) Autocorrelation function of the absolute values of hourly returns.

We first discuss the results for the turbulent air flow. As described in the previous section, we sliced the single measurement time series into 2500 time series with lower sampling rate, taking only every 2500th measurement point. This is necessary to avoid a bias in the estimation of the local variances. Before proceeding, each of these time series is globally normalized to mean zero and standard deviation one. Figure 5 shows the empirical results for the distribution of local variances and the compounded distribution. The local variances are rather well described by a $\chi^2$ distribution with $N$ degrees of freedom. We find $N = 10$ to provide the best fit to the data. The distribution of the empirical sound amplitudes is well described by a K-distribution with the same $N$ that fits the variance distribution. Hence, we arrive at a consistent picture, which supports our compounding ansatz for this measurement.
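How the best-fitting $N$ was obtained is not specified. One simple possibility, shown here purely as an illustrative assumption and not as the fitting procedure behind the quoted value, is a method-of-moments estimate: for variances distributed as a rescaled $\chi^2_N$ with mean $m$, the variance of the variances is $2m^2/N$, so $N \approx 2m^2/s^2$ with sample mean $m$ and sample variance $s^2$. The function name and the synthetic data are hypothetical.

```python
import random
import statistics


def chi2_dof_from_moments(variances):
    """Method-of-moments estimate of N for rescaled chi^2_N variances."""
    m = statistics.fmean(variances)
    s2 = statistics.variance(variances)
    return 2.0 * m * m / s2


rng = random.Random(3)
true_dof = 10
# chi^2_N / N is a Gamma distribution with shape N/2 and scale 2/N (mean one).
synthetic = [rng.gammavariate(true_dof / 2.0, 2.0 / true_dof)
             for _ in range(50_000)]
print("estimated degrees of freedom:", chi2_dof_from_moments(synthetic))
```

With real data, the estimation noise of the local variances adds to their spread, so an estimate of this kind would tend to come out too small unless the window length is chosen with care.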

FIG. 5: Results for the turbulent air flow. (Top) Distribution of variances, compared to a $\chi^2$ distribution with $N$ degrees of freedom. (Bottom) Distribution of the sound amplitudes, compared to the K-distribution with parameter $N = 10$.

The results for the daily returns of EUR-USD foreign exchange rates are shown in Fig. 6. As outlined in section IV, we calculated the daily returns as the relative changes of the exchange rate between consecutive trading days with respect to the same hour of each day. This procedure yields 24 time series of daily returns. We normalize each time series to mean zero and standard deviation one. This allows us to produce a single aggregated statistics. In the top plot of Fig. 6 we show the histogram of local variances, i.e. the variances estimated on 13-day intervals. In accordance with the finance literature, the empirical variances follow a lognormal distribution over almost three orders of magnitude, with only some deviations in the tail. The histogram of the daily returns is shown in the bottom plot of Fig. 6. The empirical result agrees rather well with the normal-lognormal compounded distribution. It is important to note, however, that we only achieve this consistent picture of variance and compounded return distribution because we have taken into account all the pitfalls of variance estimation, which we described in section IV.
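The normal-lognormal compounded distribution has no simple closed form, but it is easy to evaluate numerically. The sketch below integrates a Gaussian over a lognormal variance distribution with the trapezoidal rule, using the substitution $u = \ln v$. The parameters `mu` and `s` of the lognormal are hypothetical; they are chosen so that the mean variance equals one, in line with the normalization of the returns to standard deviation one.

```python
import math


def normal_pdf(x, mean, var):
    """Density of a Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def normal_lognormal_pdf(r, mu, s, n_grid=401):
    """Density of r for r | v ~ N(0, v) with ln v ~ N(mu, s^2)."""
    lo, hi = mu - 8.0 * s, mu + 8.0 * s
    h = (hi - lo) / (n_grid - 1)
    total = 0.0
    for k in range(n_grid):
        u = lo + k * h  # u = ln v
        weight = 0.5 if k in (0, n_grid - 1) else 1.0
        total += weight * normal_pdf(u, mu, s * s) * normal_pdf(r, 0.0, math.exp(u))
    return total * h


s = 0.7              # hypothetical width of the lognormal
mu = -0.5 * s * s    # gives mean variance exp(mu + s^2 / 2) = 1
for r in (0.0, 2.0, 5.0):
    print(r, normal_lognormal_pdf(r, mu, s), normal_pdf(r, 0.0, 1.0))
```

Comparing the two printed densities at large `r` shows the heavier tails of the compounded distribution relative to a Gaussian of the same variance.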


VI. CONCLUSIONS 

We applied the compounding approach to two different systems: a ventilator setup generating turbulent air flow, and foreign exchange rates. Both systems are characterized by univariate time series with non-stationary variances. Our main objective was to empirically determine the distribution of variances and thus arrive at a consistent picture. The estimation of variances from a single, non-stationary time series presents several pitfalls, which have to be taken into account carefully. First of all, we have to avoid serial correlations in the signal itself. These might otherwise lead to an estimation bias. For the sound measurement, we had to reduce the sampling rate of the data to achieve this. The foreign exchange data presented another obstacle for variance estimation: we observed a characteristic intraday variability which had to be taken into account. Last but not least, it is a prerequisite that the non-stationary variances are not purely stochastic, but exhibit a slowly decaying autocorrelation. Otherwise we would not be able to determine a reasonable variance distribution for the compounding ansatz. When we take all these aspects into account, we arrive at the correct variance distribution. To a good approximation we found a $\chi^2$ distribution in the case of ventilator turbulence, which leads to a K-distribution for the compounded statistics. For the foreign exchange returns we observe lognormally distributed variances, and the normal-lognormal compounded distribution fits the return histogram well. A central assumption in the compounding ansatz is the stationarity of the variance distribution. This assumption might not always be satisfied, which can lead to deviations from the compounded distribution.

FIG. 6: Results for daily returns of foreign exchange rates EUR-USD. (Top) Distribution of variances, compared to a lognormal distribution. (Bottom) Distribution of the daily returns, compared to the normal-lognormal compounded distribution (red).